Wiktionnaire's Wikicode GLAWIfied: a Workable French Machine-Readable Dictionary

Authors

  • Nabil Hathout
  • Franck Sajous
Abstract

GLAWI is a free, large-scale and versatile Machine-Readable Dictionary (MRD) extracted from Wiktionnaire, the French language edition of Wiktionary. In Sajous and Hathout (2015), we introduced GLAWI, gave the rationale behind the creation of this lexicographic resource and described the extraction process, focusing on the conversion and standardization of the heterogeneous data provided by this collaborative dictionary. In the current article, we describe the content of GLAWI and illustrate how it is structured. We also suggest various applications, ranging from linguistic studies and NLP applications to psycholinguistic experimentation, all of which can take advantage of the diversity of the lexical knowledge available in GLAWI. Besides this diversity and extensive lexical coverage, GLAWI is also remarkable because it is the only free lexical resource of contemporary French that contains definitions. This unique material opens the way to the renewal of MRD-based methods, notably the automated extraction and acquisition of semantic relations.

Similar resources

Combining Corpus and Machine-Readable Dictionary Data for Building Bilingual

This paper describes and discusses some theoretical and practical problems arising from developing a system to combine the structured but incomplete information from machine readable dictionaries (MRDs) with the unstructured but more complete information available in corpora for the creation of a bilingual lexical data base, presenting a methodology to integrate information from both sources in...

Corpus and dictionary development for classifiers/quantifiers towards a French-Japanese machine translation

Although quantifier/classifier expressions occur frequently in everyday communication and written documents, they are not described in classical bilingual paper dictionaries or in machine-readable dictionaries. The paper describes a corpus and dictionary development for quantifiers/classifiers, and their usage in the framework of French-Japanese machine translation (MT). They ofte...

Combining Unsupervised Lexical Knowledge Methods for Word Sense Disambiguation

This paper presents a method to combine a set of unsupervised algorithms that can accurately disambiguate word senses in a large, completely untagged corpus. Although most of the techniques for word sense resolution have been presented as stand-alone, it is our belief that full-fledged lexical ambiguity resolution should combine several information sources and techniques. The set of techniques ...

Sense-Linking in a Machine Readable Dictionary

Dictionaries contain a rich set of relationships between their senses, but often these relationships are only implicit. We report on our experiments to automatically identify links between the senses in a machine-readable dictionary. In particular, we automatically identify instances of zero-affix morphology, and use that information to find specific linkages between senses. This work has provided i...

Learning to Augment a Machine-Readable Dictionary

Dictionaries will always be incomplete; sometimes a word will acquire a new sense in a technical field, and new words are being added to the language all the time. This paper will discuss our comparisons between a machine-readable dictionary and various information retrieval test collections. We will first report on the number of words found in the dictionary, and how much improvement is gained by ...

Publication date: 2016